Abstract: In this paper we develop a statistical mechanical 
interpretation of the noiseless source coding scheme based on an 
absolutely optimal instantaneous code. The notions in statistical 
mechanics such as statistical mechanical entropy, temperature, 
and thermal equilibrium are translated into the context of 
noiseless source coding. In particular, we find that 
temperature 1 corresponds to the average codeword length of an 
instantaneous code in this statistical mechanical interpretation 
of the noiseless source coding scheme. This correspondence is also 
confirmed by an investigation based on the box-counting dimension. 
Using the notion of temperature and statistical mechanical arguments, 
some information-theoretic relations can be derived in a manner 
that appeals to intuition. 



I. Introduction 

We introduce a statistical mechanical interpretation of the 
noiseless source coding scheme based on an absolutely optimal 
instantaneous code. The notions in statistical mechanics such 
as statistical mechanical entropy, temperature, and thermal 
equilibrium are translated into the context of noiseless source 
coding. 

We identify a message coded by an instantaneous code with 
an energy eigenstate of a quantum system treated in statistical 
mechanics, and the length of the coded message with the 
energy of the eigenstate. The discreteness of the length of a 
coded message naturally corresponds to statistical mechanics 
based on quantum mechanics rather than on classical mechanics. 
This is because the energy of a quantum system takes discrete 
values, while energy generally takes continuous values in 
classical physics. In particular, in this statistical mechanical 
interpretation of noiseless source coding, the energy of the 
corresponding quantum system is bounded from above, and 
therefore the system can have negative temperature. We find 
that temperature 1 corresponds to the average codeword 
length of an instantaneous code in this interpretation. This 
correspondence is also confirmed by an investigation based on 
the box-counting dimension. 

Note that we do not insist on mathematical rigor 
in this paper. We respect statistical mechanical 
intuition in order to shed light on a hidden statistical 
mechanical aspect of information theory, and therefore argue 
at the same level of mathematical rigor as 
statistical mechanics. 



II. Instantaneous codes 

We start with some notation on instantaneous codes from 
information theory [9], [1], [3]. 

For any set $S$, $\#S$ denotes the number of elements in $S$. 
We denote the set of all finite binary strings by $\{0,1\}^*$. For 
any $s \in \{0,1\}^*$, $|s|$ is the length of $s$. We define an alphabet 
to be any nonempty finite set. 

Let $X$ be an arbitrary random variable with an alphabet $\mathcal{H}$ 
and a probability mass function $p_X(x) = \Pr\{X = x\}$, $x \in \mathcal{H}$. 
Then the entropy $H(X)$ of $X$ is defined by 

$$H(X) = -\sum_{x \in \mathcal{H}} p_X(x) \log p_X(x),$$

where the log is to the base 2. We will introduce the notion 
of a statistical mechanical entropy later. To 
distinguish $H(X)$ from it, we call $H(X)$ the 
Shannon entropy of $X$. A subset $S$ of $\{0,1\}^*$ is called a prefix-free 
set if no string in $S$ is a prefix of any other string in $S$. An 
instantaneous code $C$ for the random variable $X$ is an injective 
mapping from $\mathcal{H}$ to $\{0,1\}^*$ such that $C(\mathcal{H}) = \{C(x) \mid x \in \mathcal{H}\}$ 
is a prefix-free set. For each $x \in \mathcal{H}$, $C(x)$ is called the 
codeword corresponding to $x$, and $|C(x)|$ is denoted by $l(x)$. A 
sequence $x_1, x_2, \dots, x_N$ with $x_i \in \mathcal{H}$ is called a message. On 
the other hand, the finite binary string $C(x_1)C(x_2) \cdots C(x_N)$ 
is called the coded message for a message $x_1, x_2, \dots, x_N$. 

Instantaneous codes play an important role in the 
noiseless source coding problem, described as follows. Let 
$X_1, X_2, \dots, X_N$ be independent identically distributed random 
variables drawn from the probability mass function 
$p_X(x)$. The objective of the noiseless source coding 
problem is to minimize the length of the binary string 
$C(x_1)C(x_2) \cdots C(x_N)$ for a message $x_1, x_2, \dots, x_N$ generated 
by the random variables $\{X_i\}$ as $N \to \infty$. For that 
purpose, it is sufficient to consider the average codeword 
length $L_X(C)$ of an instantaneous code $C$ for the random 
variable $X$, which is defined by 

$$L_X(C) = \sum_{x \in \mathcal{H}} p_X(x)\, l(x)$$

independently of the value of $N$. We can then show that 
$L_X(C) \ge H(X)$ for any instantaneous code $C$ for the 
random variable $X$. Hence, the Shannon entropy gives the 
data compression limit for the noiseless source coding problem 
based on instantaneous codes. Thus, it is important to consider 
the notion of absolute optimality of an instantaneous code, 
where we say that an instantaneous code $C$ for the random 
variable $X$ is absolutely optimal if $L_X(C) = H(X)$. We can 
see that an instantaneous code $C$ is absolutely optimal if and 
only if $p_X(x) = 2^{-l(x)}$ for all $x \in \mathcal{H}$. 
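
As a minimal numerical sketch of these definitions, the following Python fragment computes the Shannon entropy and the average codeword length for a small dyadic source and checks the condition for absolute optimality. The four-symbol source, its code, and all function names are hypothetical choices made for illustration; they do not appear in the argument above.

```python
import math

def shannon_entropy(pmf):
    """Shannon entropy in bits of a pmf given as {symbol: probability}."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)

def average_length(pmf, code):
    """Average codeword length L_X(C) = sum_x p_X(x) l(x)."""
    return sum(p * len(code[x]) for x, p in pmf.items())

def is_prefix_free(words):
    """True if no word is a prefix of another distinct word."""
    words = list(words)
    return not any(a != b and b.startswith(a) for a in words for b in words)

# Hypothetical dyadic source and instantaneous code (assumed for illustration).
pmf = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
code = {"a": "0", "b": "10", "c": "110", "d": "111"}

H = shannon_entropy(pmf)
L = average_length(pmf, code)
# Absolute optimality: p_X(x) = 2^{-l(x)} for every symbol x.
absolutely_optimal = all(p == 2.0 ** -len(code[x]) for x, p in pmf.items())
print(H, L, absolutely_optimal)
```

For this particular source every probability is a power of 1/2, so the entropy and the average codeword length coincide; perturbing any probability would break the equality.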

Finally, for each $x^N = (x_1, x_2, \dots, x_N) \in \mathcal{H}^N$, we define 
$p_X(x^N)$ as $p_X(x_1) p_X(x_2) \cdots p_X(x_N)$. 

III. Statistical Mechanical Interpretation 

In this section, we develop a statistical mechanical inter- 
pretation of the noiseless source coding by an instantaneous 
code. In what follows, we assume that an instantaneous code 
C for a random variable X is absolutely optimal. 

In statistical mechanics [7], [11], [8], we consider a quantum 
system $\mathcal{S}_{\mathrm{total}}$ which consists of a large number of identical 
quantum subsystems. Let $N$ be the number of such subsystems. 
For example, $N \sim 10^{22}$ for 1 cm$^3$ of a gas at room temperature. 
We assume here that each quantum subsystem is 
distinguishable from the others. Thus, we deal with quantum 
particles which obey Maxwell-Boltzmann statistics and not 
Bose-Einstein statistics or Fermi-Dirac statistics. Under this 
assumption, we can identify the $i$th quantum subsystem $\mathcal{S}_i$ 
for each $i = 1, \dots, N$. In quantum mechanics, any quantum 
system is completely described by a quantum state. In statistical 
mechanics, among all quantum states, energy eigenstates 
are of particular importance. Any energy eigenstate of each 
subsystem $\mathcal{S}_i$ can be specified by a number $n = 1, 2, 3, \dots$, 
called a quantum number, where the subsystem in the energy 
eigenstate specified by $n$ has the energy $E_n$. Then, any 
energy eigenstate of the system $\mathcal{S}_{\mathrm{total}}$ can be specified by an 
$N$-tuple $(n_1, n_2, \dots, n_N)$ of quantum numbers. If the state 
of the system $\mathcal{S}_{\mathrm{total}}$ is the energy eigenstate specified by 
$(n_1, n_2, \dots, n_N)$, then the state of each subsystem $\mathcal{S}_i$ is the 
energy eigenstate specified by $n_i$ and the system $\mathcal{S}_{\mathrm{total}}$ has 
the energy $E_{n_1} + E_{n_2} + \cdots + E_{n_N}$. Then, the fundamental 
postulate of statistical mechanics is stated as follows. 

Fundamental Postulate: If the energy of the system $\mathcal{S}_{\mathrm{total}}$ 
is known to have a constant value in the range between $E$ 
and $E + \delta E$, where $\delta E$ is the indeterminacy in measurement 
of the energy of the system $\mathcal{S}_{\mathrm{total}}$, then the system $\mathcal{S}_{\mathrm{total}}$ is 
equally likely to be in any energy eigenstate specified by 
$(n_1, n_2, \dots, n_N)$ such that $E \le E_{n_1} + E_{n_2} + \cdots + E_{n_N} \le E + \delta E$. 



Let $\Omega(E, N)$ be the total number of energy eigenstates of 
$\mathcal{S}_{\mathrm{total}}$ specified by $(n_1, n_2, \dots, n_N)$ such that 
$E \le E_{n_1} + E_{n_2} + \cdots + E_{n_N} \le E + \delta E$. The above postulate states that any 
energy eigenstate of $\mathcal{S}_{\mathrm{total}}$ whose energy lies between $E$ and 
$E + \delta E$ occurs with probability $1/\Omega(E, N)$. This uniform 
distribution of energy eigenstates whose energy lies between $E$ 
and $E + \delta E$ is called a microcanonical ensemble. In statistical 
mechanics, the entropy $S(E, N)$ of the system $\mathcal{S}_{\mathrm{total}}$ is then 
defined by 

$$S(E, N) = k \ln \Omega(E, N),$$

where $k$ is a positive constant, called the Boltzmann constant, 
and ln denotes the natural logarithm. Note that, 
in statistical mechanics, the entropy $S(E, N)$ is normally 
estimated to first order in $N$ and $E$. Thus the magnitude of 
the indeterminacy $\delta E$ of the energy does not matter unless it 
is too small. The temperature $T(E, N)$ of the system $\mathcal{S}_{\mathrm{total}}$ is 
defined by 

$$\frac{1}{T(E, N)} = \frac{\partial S}{\partial E}(E, N).$$

Thus the temperature is a function of $E$ and $N$. The average 
energy $\varepsilon$ per subsystem is given by $E/N$. 

Now we give a statistical mechanical interpretation to the 
noiseless source coding scheme based on an instantaneous 
code. Let $X$ be an arbitrary random variable with an alphabet 
$\mathcal{H}$, and let $C$ be an absolutely optimal instantaneous 
code for the random variable $X$. Let $X_1, X_2, \dots, X_N$ be 
independent identically distributed random variables drawn 
from the probability mass function $p_X(x)$ for a large $N$, say 
$N \sim 10^{22}$. We relate the noiseless source coding based on $C$ 
to the above statistical mechanics as follows. The sequence 
$X_1, X_2, \dots, X_N$ corresponds to the quantum system $\mathcal{S}_{\mathrm{total}}$, 
where each $X_i$ corresponds to the $i$th quantum subsystem 
$\mathcal{S}_i$. We relate $x \in \mathcal{H}$, or equivalently, $C(x)$ to an energy 
eigenstate of a subsystem, and we relate $l(x) = |C(x)|$ to the 
energy $E_n$ of the energy eigenstate of the subsystem. Then a 
sequence $(x_1, \dots, x_N) \in \mathcal{H}^N$, or equivalently, a finite binary 
string $C(x_1) \cdots C(x_N)$ corresponds to an energy eigenstate of 
$\mathcal{S}_{\mathrm{total}}$ specified by $(n_1, \dots, n_N)$. Thus, $l(x_1) + \cdots + l(x_N) = 
|C(x_1) \cdots C(x_N)|$ corresponds to the energy $E_{n_1} + \cdots + E_{n_N}$ 
of the energy eigenstate of $\mathcal{S}_{\mathrm{total}}$. 

We define a subset $\mathcal{C}(L, N)$ of $\{0,1\}^*$ as the set of all 
coded messages $C(x_1) \cdots C(x_N)$ whose length lies between 
$L$ and $L + \delta L$. Then $\Omega(L, N)$ is defined as $\#\mathcal{C}(L, N)$. 
Therefore $\Omega(L, N)$ is the total number of coded messages 
whose length lies between $L$ and $L + \delta L$. We can see that 
if $C(x_1) \cdots C(x_N) \in \mathcal{C}(L, N)$, then 
$2^{-(L + \delta L)} \le p_X(x^N) \le 2^{-L}$. This is because $C$ is an absolutely optimal 
instantaneous code, so that $p_X(x^N) = 2^{-|C(x_1) \cdots C(x_N)|}$. Thus all coded messages 
$C(x_1) \cdots C(x_N) \in \mathcal{C}(L, N)$ occur with probability $2^{-L}$. Note here that we 
care nothing about the magnitude of $\delta L$, as in the case of 
statistical mechanics. Thus, given that the length of coded message 
is $L$, all coded messages occur with the same probability 
$1/\Omega(L, N)$. We introduce a microcanonical ensemble on the 
noiseless source coding in this manner. Thus we can develop 
a certain sort of statistical mechanics on the noiseless source 
coding scheme. 
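
As a minimal sketch of this counting, the following Python fragment enumerates the number of coded messages of each total length for a small code. It assumes that the window $\delta L$ is narrow enough to contain a single integer length, and the code lengths [1, 2, 3, 3] (for example the codewords 0, 10, 110, 111) and the function name are hypothetical choices for illustration.

```python
from collections import Counter

def count_coded_messages(lengths, N):
    """Return {L: number of coded messages of N codewords with total length L}.

    `lengths` lists l(x) for every x in the alphabet, one entry per codeword.
    """
    omega = Counter({0: 1})              # the empty message has length 0
    for _ in range(N):
        nxt = Counter()
        for total, ways in omega.items():
            for l in lengths:
                nxt[total + l] += ways   # append one more codeword
        omega = nxt
    return dict(omega)

lengths = [1, 2, 3, 3]   # assumed absolutely optimal code: 0, 10, 110, 111
omega = count_coded_messages(lengths, N=2)

# Each coded message of length L has probability 2^{-L}, so summing
# Omega(L, N) * 2^{-L} over all L must give total probability 1.
total_probability = sum(ways * 2.0 ** -L for L, ways in omega.items())
print(omega, total_probability)
```

Within each length class all coded messages are equally probable, which is the microcanonical picture described above.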

The statistical mechanical entropy $S(L, N)$ of the instantaneous 
code $C$ is defined by 

$$S(L, N) = \log \Omega(L, N).$$

The temperature $T(L, N)$ of $C$ is then defined by 

$$\frac{1}{T(L, N)} = \frac{\partial S}{\partial L}(L, N).$$

Thus the temperature is a function of $L$ and $N$. The average 
length $\lambda$ of coded message per codeword is given by $L/N$. 
The average length $\lambda$ corresponds to the average energy $\varepsilon$ in 
the statistical mechanics above. 



IV. Properties of Statistical Mechanical Entropy 

In statistical mechanics, it is important to know the values 
of the energy $E_n$ of subsystem $\mathcal{S}_i$ for all quantum numbers $n$, 
since the values determine the entropy $S(E, N)$ of the quantum 
system $\mathcal{S}_{\mathrm{total}}$. Correspondingly, the knowledge 
of $l(x)$ for all $x \in \mathcal{H}$ is important to calculate $S(L, N)$. We 
investigate some properties of $S(L, N)$ and $T(L, N)$ based on 
$l(x)$ in the following. 

As is well known in statistical mechanics, if the energy of a 
quantum system $\mathcal{S}_{\mathrm{total}}$ is bounded from above, then the system 
can have negative temperature. The same situation happens in 
our statistical mechanics developed on an instantaneous code 
$C$, since $C$ has only finitely many codewords. We define 
$l_{\min}$ and $l_{\max}$ as $\min\{l(x) \mid x \in \mathcal{H}\}$ and 
$\max\{l(x) \mid x \in \mathcal{H}\}$, respectively. Given $N$, the statistical mechanical entropy 
$S(L, N)$ is a unimodal function of $L$ and takes nonzero value 
only between $N l_{\min}$ and $N l_{\max}$. Let $L_0$ be the value of $L$ which 
maximizes $S(L, N)$. If $L < L_0$ then $T(L, N) > 0$. On the 
other hand, if $L > L_0$ then $T(L, N) < 0$. The temperature 
$T(L, N)$ takes $\pm\infty$ at $L = L_0$. 
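
The sign change of the temperature can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it uses hypothetical code lengths [1, 2, 3, 3], a small $N = 30$, a window $\delta L$ containing one integer length, and a finite difference $S(L+1, N) - S(L, N)$ as a stand-in for $\partial S / \partial L$.

```python
import math

def log2_omega(lengths, N):
    """S(L, N) = log2 Omega(L, N) for every attainable total length L."""
    omega = {0: 1}
    for _ in range(N):
        nxt = {}
        for total, ways in omega.items():
            for l in lengths:
                nxt[total + l] = nxt.get(total + l, 0) + ways
        omega = nxt
    return {L: math.log2(w) for L, w in omega.items()}

lengths = [1, 2, 3, 3]       # assumed code lengths, for illustration only
N = 30
S = log2_omega(lengths, N)
L0 = max(S, key=S.get)       # length that maximizes S(L, N)

# Finite-difference stand-in for dS/dL = 1/T; its sign is the sign of T.
slope = {L: S[L + 1] - S[L] for L in S if L + 1 in S}
print(min(S), L0, max(S))
print(slope[min(S)], slope[max(S) - 1])
```

The attainable lengths run from $N l_{\min}$ to $N l_{\max}$; the slope is positive at the short end (positive temperature) and negative at the long end (negative temperature).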

According to the method of Boltzmann and Planck (see 
e.g. [11]), we can show that 

$$S(L, N) = N H(G(C, T(L, N))), \tag{1}$$

where $G(C, T)$ is the random variable with the alphabet $\mathcal{H}$ and 
the probability mass function $p_{G(C,T)}(x) = \Pr\{G(C, T) = x\}$ 
defined by 

$$p_{G(C,T)}(x) = \frac{2^{-l(x)/T}}{\sum_{a \in \mathcal{H}} 2^{-l(a)/T}}.$$

The temperature $T(L, N)$ is implicitly determined through the 
equation 

$$\frac{L}{N} = \sum_{x \in \mathcal{H}} l(x)\, p_{G(C, T(L,N))}(x) \tag{2}$$

as a function of $L$ and $N$. These properties of $S(L, N)$ and 
$T(L, N)$ are derived based only on a combinatorial aspect of 
$S(L, N)$. 
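
Relations (1) and (2) can be checked numerically on a small example. The following sketch is not part of the derivation: it assumes hypothetical code lengths [1, 2, 3, 3], solves (2) for the temperature by bisection on the inverse temperature $1/T$ (an implementation choice made here), and compares $N H(G(C, T))$ with the exact value of $\log \Omega(L, N)$ obtained by counting. Since (1) holds only to first order in $N$, the two values are expected to agree up to a correction that is small relative to $N$.

```python
import math

def gibbs_pmf(lengths, T):
    """p_{G(C,T)}(x) proportional to 2^{-l(x)/T}, one entry per codeword."""
    exps = [-l / T for l in lengths]
    m = max(exps)                              # shift for numerical stability
    w = [2.0 ** (e - m) for e in exps]
    z = sum(w)
    return [wi / z for wi in w]

def entropy_bits(pmf):
    return -sum(p * math.log2(p) for p in pmf if p > 0)

def mean_length(lengths, T):
    """Right-hand side of (2)."""
    return sum(l * p for l, p in zip(lengths, gibbs_pmf(lengths, T)))

def temperature(lengths, lam, lo=-50.0, hi=50.0, iters=200):
    """Solve (2) for T by bisection on beta = 1/T (mean length decreases in beta)."""
    def mean_at(beta):
        if beta == 0.0:                        # T = +-infinity: uniform distribution
            return sum(lengths) / len(lengths)
        return mean_length(lengths, 1.0 / beta)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean_at(mid) > lam:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    return math.inf if beta == 0.0 else 1.0 / beta

def log2_omega(lengths, N, L):
    """Exact log2 of the number of N-codeword coded messages of length L."""
    omega = {0: 1}
    for _ in range(N):
        nxt = {}
        for total, ways in omega.items():
            for l in lengths:
                nxt[total + l] = nxt.get(total + l, 0) + ways
        omega = nxt
    return math.log2(omega[L])

lengths = [1, 2, 3, 3]                         # assumed code lengths
N, L = 200, 350                                # L / N = 1.75
T = temperature(lengths, L / N)
predicted = N * entropy_bits(gibbs_pmf(lengths, T))   # right-hand side of (1)
exact = log2_omega(lengths, N, L)
print(T, predicted, exact)
```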

Now, let us take into account the probabilistic aspect given by 
the random variables $X_1, X_2, \dots, X_N$. Since the instantaneous 
code $C$ is absolutely optimal, a particular coded message of 
length $L$ occurs with probability $2^{-L}$. Thus the probability 
that some coded message of length $L$ occurs is given by 
$2^{-L}\Omega(L, N)$. Hence, by differentiating $2^{-L}\Omega(L, N)$, or equivalently 
its logarithm $-L + S(L, N)$, with respect to $L$ and 
setting the result to 0, we can determine the most probable 
length $L^*$ of coded message, given $N$. Thus we have the 
relation 

$$\left.\frac{\partial}{\partial L}\{-L + S(L, N)\}\right|_{(L,N)=(L^*,N)} = 0,$$

which is satisfied by $L^*$. It follows that $T(L^*, N) = 1$. Thus, 
temperature 1 corresponds to the most probable length 
$L^*$. On the other hand, $p_{G(C,1)}(x) = 2^{-l(x)} = p_X(x)$ at $T(L^*, N) = 1$, 
and therefore, by (2), we have $L^*/N = H(X) = L_X(C)$. 
Since $C$ is absolutely optimal, this result is consistent with the 
law of large numbers. Thus, temperature 1 corresponds to 
the average codeword length $L_X(C)$, which is equal to the 
average length $\lambda$ of coded message per codeword at 
temperature 1. 
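
The statement that the most probable length satisfies $L^*/N = H(X)$ can be examined on a small case. The sketch below assumes hypothetical code lengths [1, 2, 3, 3] with $p_X(x) = 2^{-l(x)}$ and a modest $N = 40$; for finite $N$ the maximizer is only expected to lie close to $N H(X)$, not necessarily to equal it.

```python
def length_distribution(lengths, N):
    """P(total coded length = L) = 2^{-L} * Omega(L, N) for an absolutely optimal code."""
    omega = {0: 1}
    for _ in range(N):
        nxt = {}
        for total, ways in omega.items():
            for l in lengths:
                nxt[total + l] = nxt.get(total + l, 0) + ways
        omega = nxt
    # Integer true division keeps the ratio accurate even for large counts.
    return {L: ways / 2 ** L for L, ways in omega.items()}

lengths = [1, 2, 3, 3]                      # assumed code lengths
N = 40
dist = length_distribution(lengths, N)
L_star = max(dist, key=dist.get)            # most probable total length
H = sum(l * 2.0 ** -l for l in lengths)     # H(X) = L_X(C) when p_X(x) = 2^{-l(x)}
print(L_star, N * H)
```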

V. Thermal Equilibrium between Two 
Instantaneous Codes 

Let $X^{\mathrm{I}}$ be an arbitrary random variable with an alphabet 
$\mathcal{H}^{\mathrm{I}}$, and let $C^{\mathrm{I}}$ be an absolutely optimal instantaneous code 
for the random variable $X^{\mathrm{I}}$. Let $X^{\mathrm{I}}_1, X^{\mathrm{I}}_2, \dots, X^{\mathrm{I}}_{N^{\mathrm{I}}}$ be independent 
identically distributed random variables drawn from the 
probability mass function $p_{X^{\mathrm{I}}}(x)$ for a large $N^{\mathrm{I}}$. On the other 
hand, let $X^{\mathrm{II}}$ be an arbitrary random variable with an alphabet 
$\mathcal{H}^{\mathrm{II}}$, and let $C^{\mathrm{II}}$ be an absolutely optimal instantaneous code 
for the random variable $X^{\mathrm{II}}$. Let $X^{\mathrm{II}}_1, X^{\mathrm{II}}_2, \dots, X^{\mathrm{II}}_{N^{\mathrm{II}}}$ be independent 
identically distributed random variables drawn from 
the probability mass function $p_{X^{\mathrm{II}}}(x)$ for a large $N^{\mathrm{II}}$. 

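Consider the following problem: Find the most probable 
values of $L^{\mathrm{I}}$ and $L^{\mathrm{II}}$, given that the sum $L^{\mathrm{I}} + L^{\mathrm{II}}$ of the length 
$L^{\mathrm{I}}$ of coded message by $C^{\mathrm{I}}$ for the random variables $\{X^{\mathrm{I}}_i\}$ 
and the length $L^{\mathrm{II}}$ of coded message by $C^{\mathrm{II}}$ for the random 
variables $\{X^{\mathrm{II}}_i\}$ is equal to $L$. 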

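In order to solve this problem, the statistical mechanical 
notion of "thermal equilibrium" can be used. We first note that 
a particular coded message by $C^{\mathrm{I}}$ of length $L^{\mathrm{I}}$ and a particular 
coded message by $C^{\mathrm{II}}$ of length $L^{\mathrm{II}}$ occur together with probability 
$2^{-L^{\mathrm{I}}} 2^{-L^{\mathrm{II}}} = 2^{-L}$, since the instantaneous codes $C^{\mathrm{I}}$ and $C^{\mathrm{II}}$ 
are absolutely optimal. Thus, any particular pair of coded 
messages by $C^{\mathrm{I}}$ and $C^{\mathrm{II}}$ occurs with equal probability, given 
that the total length of coded messages for $\{X^{\mathrm{I}}_i\}$ and $\{X^{\mathrm{II}}_i\}$ 
is $L$. Therefore, the most probable allocation $L^{\mathrm{I}}_*$ and $L^{\mathrm{II}}_*$ of 
$L = L^{\mathrm{I}} + L^{\mathrm{II}}$ maximizes the product $\Omega^{\mathrm{I}}(L^{\mathrm{I}}, N^{\mathrm{I}})\,\Omega^{\mathrm{II}}(L^{\mathrm{II}}, N^{\mathrm{II}})$. 
We see that this condition is equivalent to the equality 

$$T^{\mathrm{I}}(L^{\mathrm{I}}_*, N^{\mathrm{I}}) = T^{\mathrm{II}}(L^{\mathrm{II}}_*, N^{\mathrm{II}}),$$

where the functions $T^{\mathrm{I}}$ and $T^{\mathrm{II}}$ are the temperatures of $C^{\mathrm{I}}$ and 
$C^{\mathrm{II}}$, respectively. This equality corresponds to the condition 
for thermal equilibrium between two systems, given a 
total energy, in statistical mechanics. Using (2), the value of 
$T^{\mathrm{I}}(L^{\mathrm{I}}_*, N^{\mathrm{I}}) = T^{\mathrm{II}}(L^{\mathrm{II}}_*, N^{\mathrm{II}})$ is obtained by solving the following equation 
for $T$: 

$$\frac{N^{\mathrm{I}}}{L} \sum_{x \in \mathcal{H}^{\mathrm{I}}} |C^{\mathrm{I}}(x)|\, p_{G(C^{\mathrm{I}},T)}(x) + \frac{N^{\mathrm{II}}}{L} \sum_{x \in \mathcal{H}^{\mathrm{II}}} |C^{\mathrm{II}}(x)|\, p_{G(C^{\mathrm{II}},T)}(x) = 1.$$

Then, again by (2), the most probable values $L^{\mathrm{I}}_*$ and $L^{\mathrm{II}}_*$ are 
determined. 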
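
The equilibrium condition can be solved numerically. The following sketch is one possible approach rather than the method of the text: it multiplies the equation for $T$ by $L$ and applies bisection on the inverse temperature $1/T$, using the fact that the mean codeword length under $G(C, T)$ decreases as $1/T$ increases. The two sets of code lengths, the message counts, and the function names are hypothetical.

```python
def mean_length(lengths, beta):
    """sum_x l(x) p_{G(C,T)}(x) at inverse temperature beta = 1/T."""
    exps = [-beta * l for l in lengths]
    m = max(exps)                              # shift for numerical stability
    w = [2.0 ** (e - m) for e in exps]
    return sum(l * wi for l, wi in zip(lengths, w)) / sum(w)

def equilibrium(lengths_1, n_1, lengths_2, n_2, total, lo=-50.0, hi=50.0):
    """Return (beta, L1, L2): common inverse temperature and allocation, L1 + L2 = total."""
    def excess(beta):
        return (n_1 * mean_length(lengths_1, beta)
                + n_2 * mean_length(lengths_2, beta) - total)
    for _ in range(200):                       # bisection; excess decreases in beta
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    return beta, n_1 * mean_length(lengths_1, beta), n_2 * mean_length(lengths_2, beta)

code_1 = [1, 2, 3, 3]    # assumed lengths of C^I  (H(X^I)  = 1.75 bits)
code_2 = [1, 2, 2]       # assumed lengths of C^II (H(X^II) = 1.5 bits)

# Total length equal to N^I H(X^I) + N^II H(X^II): equilibrium at T = 1.
beta_a, l1_a, l2_a = equilibrium(code_1, 100, code_2, 200, total=475.0)
# A larger total length: both systems settle at a common T > 1.
beta_b, l1_b, l2_b = equilibrium(code_1, 100, code_2, 200, total=500.0)
print(1.0 / beta_a, l1_a, l2_a)
print(1.0 / beta_b, l1_b, l2_b)
```

In the second case both allocations exceed their temperature-1 values, as expected when the prescribed total length is larger than the sum of the typical lengths.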

VI. Dimension of Coded Messages 

The notion of dimension plays an important role in fractal 
geometry [6]. In this section, we investigate our statistical 
mechanical interpretation of the noiseless source coding from 
the point of view of dimension. Let $F$ be a bounded subset 
of $\mathbb{R}$, and let $N_n(F)$ be the number of $2^{-n}$-mesh cubes that 
intersect $F$, where a $2^{-n}$-mesh cube is a subset of $\mathbb{R}$ of the form 
$[m 2^{-n}, (m+1) 2^{-n}]$ for some integer $m$. The box-counting 
dimension $\dim_B F$ of $F$ is then defined by 

$$\dim_B F = \lim_{n \to \infty} \frac{\log N_n(F)}{n}.$$



Let $\{0,1\}^\infty = \{b_1 b_2 b_3 \cdots \mid b_i = 0, 1 \text{ for all } i = 1, 2, 3, \dots\}$ 
be the set of all infinite binary strings. In [10] we investigate 
the dimension of sets of coded messages of infinite length, 
where the number of distinct codewords is finite or infinite. 
In a similar manner, we investigate the set of coded messages 
of infinite length by an absolutely optimal instantaneous code 
$C$. 

By (2), the ratio $L/N$ is uniquely determined by temperature 
$T$. Thus, by letting $L, N \to \infty$ while keeping the ratio 
$L/N$ constant, we can regard the set $\mathcal{C}(L, N)$ as a subset of 
$\{0,1\}^\infty$. This kind of limit is called the thermodynamic limit 
in statistical mechanics. Taking the thermodynamic limit, we 
denote $\mathcal{C}(L, N)$ by $F(T)$, where $T$ is related to the limit value 
of $L/N$ through (2). Although $F(T)$ is a subset of $\{0,1\}^\infty$, 
we can regard $F(T)$ as a subset of $[0, 1]$ by identifying 
$\alpha \in \{0,1\}^\infty$ with the real number $0.\alpha$. In this manner, we can 
consider the box-counting dimension $\dim_B F(T)$ of $F(T)$. 

We investigate the dependency of $\dim_B F(T)$ on temperature 
$T$ with $-\infty < T < \infty$. First it can be shown that 

$$\dim_B F(T) = \lim_{L,N \to \infty} \frac{\log \Omega(L, N)}{L} = \lim_{L,N \to \infty} \frac{S(L, N)}{L},$$

where the limits are taken while satisfying (2) for each $T$. 
Thus the statistical mechanical entropy $S(L, N)$ and the box-counting 
dimension $\dim_B F(T)$ of $F(T)$ are closely related. 
By (1) and (2), we can obtain, as an explicit formula in $T$, 

$$\dim_B F(T) = \frac{1}{T} + \frac{1}{\lambda(T)} \log \sum_{x \in \mathcal{H}} 2^{-l(x)/T}, \tag{3}$$

where $\lambda(T)$ is defined by 

$$\lambda(T) = \sum_{x \in \mathcal{H}} l(x)\, p_{G(C,T)}(x).$$

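We define the "degeneracy factors" $d_{\min}$ and $d_{\max}$ of the lowest 
and highest "energies" by $d_{\min} = \#\{x \in \mathcal{H} \mid l(x) = l_{\min}\}$ and 
$d_{\max} = \#\{x \in \mathcal{H} \mid l(x) = l_{\max}\}$, respectively. Note here that 
since $C$ is assumed to be absolutely optimal, $\sum_{x \in \mathcal{H}} 2^{-l(x)} = 1$ 
and therefore $d_{\max}$ can be shown to be an even number. In 
the increasing order of the ratio $L/N$ (i.e. $\lambda(T)$), we see from 
(3) that 

$$\lim_{T \to +0} \dim_B F(T) = \frac{\log d_{\min}}{l_{\min}},$$

$$\dim_B F(1) = 1,$$

$$\lim_{T \to \pm\infty} \dim_B F(T) = \frac{n \log n}{\sum_{x \in \mathcal{H}} l(x)},$$

$$\lim_{T \to -0} \dim_B F(T) = \frac{\log d_{\max}}{l_{\max}},$$

where $n = \#\mathcal{H}$ is the number of codewords. 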



We can show that $n \log n < \sum_{x \in \mathcal{H}} l(x)$ unless all codewords 
have the same length, and obviously $\log d_{\min} / l_{\min} < 1$ and 
$\log d_{\max} / l_{\max} < 1$ except for such a trivial case. Thus, 
in general, the dimension $\dim_B F(T)$ is maximized at the 
temperature $T = 1$. This can be checked using (3) based on 
the differentiation of $\dim_B F(T)$. That is, we can show that, 
unless all codewords have the same length, the following 
hold: 

(i) $\left.\dfrac{d}{dT} \dim_B F(T)\right|_{T=T_0} = 0$ if and only if $T_0 = 1$, 

(ii) $\left.\dfrac{d^2}{dT^2} \dim_B F(T)\right|_{T=1} < 0$. 
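
Formula (3) and the limiting values can be evaluated numerically as a sanity check. The sketch below assumes hypothetical code lengths [1, 2, 3, 3], for which $n = 4$, $d_{\min} = 1$, $l_{\min} = 1$, $d_{\max} = 2$ and $l_{\max} = 3$; the limits $T \to \pm 0$ and $T \to \pm\infty$ are approximated by small and large finite values of $T$, and the derivative in (i) by a central finite difference.

```python
import math

def dim_box(lengths, T):
    """Formula (3): dim_B F(T) = 1/T + log2(sum_x 2^{-l(x)/T}) / lambda(T)."""
    exps = [-l / T for l in lengths]
    m = max(exps)                               # factor out 2^m for stability
    w = [2.0 ** (e - m) for e in exps]
    z = sum(w)                                  # sum_x 2^{-l(x)/T} = 2^m * z
    lam = sum(l * wi for l, wi in zip(lengths, w)) / z   # lambda(T)
    return 1.0 / T + (m + math.log2(z)) / lam

lengths = [1, 2, 3, 3]                          # assumed code lengths
n = len(lengths)

at_one = dim_box(lengths, 1.0)
near_plus_zero = dim_box(lengths, 0.01)         # approximates T -> +0
near_minus_zero = dim_box(lengths, -0.01)       # approximates T -> -0
at_large_T = dim_box(lengths, 1e6)              # approximates T -> +-infinity

h = 1e-4
derivative_at_one = (dim_box(lengths, 1 + h) - dim_box(lengths, 1 - h)) / (2 * h)
print(at_one, near_plus_zero, at_large_T, near_minus_zero, derivative_at_one)
```

For these lengths the approximated limits can be compared with $\log d_{\min}/l_{\min} = 0$, $n \log n / \sum_x l(x) = 8/9$ and $\log d_{\max}/l_{\max} = 1/3$, all of which are smaller than the value 1 attained at $T = 1$.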

Note that all coded messages $C(x_1)C(x_2) \cdots$ of infinite 
length form the set $\{0,1\}^\infty$ and therefore the interval $[0, 1]$, 
since $C$ is an absolutely optimal instantaneous code. Thus, 
since $\dim_B F(1)$ is equal to $\dim_B [0, 1]$, the set $F(1)$ is as 
rich as the set $[0, 1]$ in a certain sense. This can be explained 
as follows. Since $L/N = L_X(C)$ at the temperature $T = 1$, 
as seen in Section IV, by the law of large numbers the length 
of coded message for a message of length $N$ is likely to 
equal $N L_X(C)$ for a sufficiently large $N$. Thus $F(1)$ contains 
almost all finite binary strings of length $L$. In other words, $F(1)$ 
consists of coded messages for all messages which form the 
typical set in a sense. 

VII. Conclusion 

In this paper we have developed a statistical mechanical 
interpretation of the noiseless source coding scheme based 
on an absolutely optimal instantaneous code. The notions in 
statistical mechanics such as statistical mechanical entropy, 
temperature, and thermal equilibrium are translated into the 
context of information theory. In particular, we find 
that temperature 1 corresponds to the average codeword 
length $L_X(C)$ in this statistical mechanical interpretation of 
information theory. This correspondence is also confirmed by an 
investigation based on the box-counting dimension. The argument is 
not necessarily mathematically rigorous. However, using the 
notion of temperature and statistical mechanical arguments, 
several information-theoretic relations can be derived in a 
manner that appeals to intuition. 

A statistical mechanical interpretation of the general case 
where the underlying instantaneous code is not necessarily 
absolutely optimal is reported in another work. 